<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Student's (1906) Yeast Cell Counts</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="R.css" />
</head><body>

<table width="100%" summary="page for Yeast"><tr><td>Yeast</td><td style="text-align: right;">R Documentation</td></tr></table>

<h2>
Student's (1906) Yeast Cell Counts
</h2>

<h3>Description</h3>

<p>Counts of the number of yeast cells were made each of 400 regions in a 20 x 20 grid on a microscope
slide, comprising a 1 sq. mm. area.
This experiment was repeated four times, giving samples A, B, C and D.
</p>
<p>Student (1906) used these data to investigate the errors in random sampling.
He says &quot;there are two sources of error: (a) the drop taken may not be representative
of the bulk of the liquid; (b) the distribution of the cells over the area 
which is examined is never exactly uniform, so that there is an 'error of
random sampling.'&quot;
</p>
<p>The data in the paper are provided in the form of discrete frequency distributions
for the four samples.  Each shows the frequency distribution squares containing
a <code>count</code> of 0, 1, 2, ... yeast cells. These are combined here in <code>Yeast</code>. 
In addition, he gives a table
(Table I) showing the actual number of yeast cells counted in the 20 x 20
grid for sample D, given here as <code>YeastD.mat</code>.
</p>


<h3>Usage</h3>

<pre>
	data(Yeast)
	data(YeastD.mat)
	</pre>


<h3>Format</h3>

<p><code>Yeast</code>: A frequency data frame with 36 observations on the following 3 variables,
giving the frequencies of 
</p>

<dl>
<dt><code>sample</code></dt><dd><p>Sample identifier, a factor with levels <code>A</code> <code>B</code> <code>C</code> <code>D</code></p>
</dd>
<dt><code>count</code></dt><dd><p>The number of yeast cells counted in a square</p>
</dd>
<dt><code>freq</code></dt><dd><p>The number of squares with the given <code>count</code></p>
</dd>
</dl>

<p><code>YeastD.mat</code>: A 20 x 20 matrix containing the count of yeast cells in each square for
sample D.  
</p>


<h3>Details</h3>

<p>Student considers the distribution of a total of <i>Nm</i> particles distributed over
<i>N</i> unit areas with an average of <i>m</i> particles per unit area.
With uniform mixing, for a given particle, the probability of it falling on any one
area is <i>p = 1/N</i>, and not falling on that area is <i>q = 1 - 1/N</i>.
He derives the probability distribution of 0, 1, 2, 3, ...
particles on a single unit area from the binomial expansion of <i>(p + q)^{mN}</i>.
</p>


<h3>Source</h3>

<p>D. J. Hand, F. Daly,  D. Lunn, K. McConway and E. Ostrowski (1994). 
<em>A Handbook of Small Data Sets</em>.
London: Chapman \&amp; Hall.
The data may be found at:
<a href="https://www2.stat.duke.edu/courses/Spring98/sta113/Data/Hand/yeast.dat">https://www2.stat.duke.edu/courses/Spring98/sta113/Data/Hand/yeast.dat</a>
</p>


<h3>References</h3>

<p>&quot;Student&quot; (1906)
On the error of counting with a haemocytometer.
Biometrika, 5, 351-360.
<a href="http://www.medicine.mcgill.ca/epidemiology/hanley/c626/Student_counting.pdf">http://www.medicine.mcgill.ca/epidemiology/hanley/c626/Student_counting.pdf</a>
</p>


<h3>Examples</h3>

<pre>
data(Yeast)

require(lattice)
# basic bar charts 
# TODO: frequencies should start at 0, not 1.
barchart(count~freq|sample, data=Yeast, ylab="Number of Cells", xlab="Frequency")
barchart(freq~count|sample, data=Yeast, xlab="Number of Cells", ylab="Frequency",
	horizontal=FALSE, origin=0)

# same, using xyplot
xyplot(freq~count|sample, data=Yeast, xlab="Number of Cells", ylab="Frequency",
	horizontal=FALSE, origin=0, type="h", lwd=10)
</pre>


</body></html>
